


Dragon Age: By the Numbers

by Scuffin_MacGuffin



Category: Dragon Age - All Media Types
Genre: Charts!, Data!, Fandom Analysis, Graphs!, Meta, oh my!
Language: English
Status: In-Progress
Published: 2014-05-02
Updated: 2014-05-02
Packaged: 2018-01-21 10:06:31
Rating: General Audiences
Warnings: No Archive Warnings Apply
Chapters: 3
Words: 5,702
Publisher: archiveofourown.org
Story URL: https://archiveofourown.org/works/1546856
Author URL: https://archiveofourown.org/users/Scuffin_MacGuffin/pseuds/Scuffin_MacGuffin
Summary: This work is a data-driven analysis of the "Dragon Age - All Media Types" tag on Archive of Our Own. It contains all sorts of graphs and statistics about the body of work created by Dragon Age fans on Ao3, made using data I've collected about pairing categories, wordcount, ratings, tags, and hitcount (to name just a few of my variables). I point to interesting trends, relationships, and tidbits in the data in as friendly and engaging a way as possible. If I have succeeded, then this work is hopefully quite interesting to anyone curious about the Dragon Age fandom.





	1. Introduction

**Summary for the Chapter:**

> In which I tell you what this project is all about, provide brief justification for its existence, and furnish readers with a table of contents.

**Notes for the Chapter:**

>   1. While I recommend reading the introduction, it is not strictly necessary. It is _definitely_ not necessary to read the chapters in order. Please look at the table of contents (found at the bottom of the [first chapter](http://archiveofourown.org/works/1546856/chapters/3276539)) and click around to whatever topic interests you the most.
>   2. Quick summaries of each chapter are available at the end of the chapter, in case you don't want to read the whole thing.
>   3. Assume that all graphs and statistics you find in this piece are current to the date the chapter in which they are contained was last updated.
>   4. Please keep in mind that this is an evolving project, and I am eager to take into account feedback both in going forward with it and in perfecting what I already have. I encourage readers to leave comments here or get in touch with me [on tumblr](http://darktownbakery.tumblr.com/) with any criticisms, suggestions, or questions.
> 


+

For the past few months I have been working on _Dragon Age: By The Numbers._ I became interested in doing a data-driven analysis of fandom trends by way of my interest in the debates that float around fandom, as I tend to be very opinionated in these debates myself. There are all sorts of fascinating questions which simply beg for answers -- is slash really fetishized? Is it that everybody hates Anders or is it that everybody loves Anders? Is it fair to call the Dragon Age fandom sexist? Why doesn't femslash get more love? 

These are not questions that can be answered solely through data, but I think that data can provide useful insight to their investigation. It can at least provide some substantiation of the claims that we make while arguing with one another. And, if done right, it may well bring up new questions.

To these ends, I have set about gathering, interpreting, and visualizing data about the fan works produced by the Dragon Age fandom. This is an ongoing process, and I am not nearly done yet. I doubt I ever will be. What I can do is post my results as I go along, and make my methods as transparent and open to collaboration as possible.

My data source for this project is the “Dragon Age - All Media Types” tag on Archive of Our Own. I gathered this data using an app called [kimono](http://www.kimonolabs.com/), with which I was able to obtain a dataset thousands of rows long containing all sorts of information about each story found in the Dragon Age tag. This dataset is up to date as of April 15, 2014. 

It is important to note that this dataset is not perfect. Ao3 is likely not representative of the Dragon Age fandom; I chose it as a source because of its seeming popularity among tumblr users (who I see as my primary audience) and because it is the website that is most easily compatible with kimono. I imagine that gathering data from [Fanfiction.net](https://www.google.com/url?q=https%3A%2F%2Fwww.fanfiction.net%2Fgame%2FDragon-Age%2F&sa=D&sntz=1&usg=AFQjCNEC8AGUS22X-oSZpu2py3Y25yhvXA) or the [Dragon Age Kink Meme](http://www.google.com/url?q=http%3A%2F%2Fdragonage-kink.livejournal.com%2F&sa=D&sntz=1&usg=AFQjCNGf8oxJBTMpTVDjuWiLKKgfh7EozA) would yield different and interesting results, but gathering that data would be a trying process (perhaps one day I will attempt it anyway). Additionally, collecting data with kimono is a process that inevitably creates some inaccuracies. I don’t feel that these inaccuracies are severe enough to invalidate my conclusions, but you can check out a full explanation of my process in [Chapter 2](http://archiveofourown.org/works/1546856/chapters/3282578) and decide that for yourself.

As I do work on this project, I will continue to post my results as new chapters in this work, and I will continue to update the Table of Contents in this introductory chapter as the work evolves. If you want to follow the progress of this project, the easiest way is probably to subscribe to this work. You could also [follow the blog I help run on tumblr](http://darktownbakery.tumblr.com/), or you could track the "dragon age by the numbers" tag on tumblr.

If you would like to see/play with the raw data yourself, please leave me a comment or drop me an ask on tumblr and I’ll be glad to share it with you. Also definitely hit me up if you know anything about data analysis/coding, because I have been teaching myself all this stuff and would definitely welcome assistance!

Many thanks to my dear firstblush, who taught me how to use Excel, merged many a csv file for me, and provided assistance in doing calculations. Many thanks also to my kind teachers, who advised me on how to best collect and present the data, and helped me immensely with cleaning it. 

And now that the talking's all over with, here is my table of contents, in which you will find a brief summary of every chapter in addition to its title, so that you can easily find what interests you the most. Happy reading, and may you find this analysis intriguing.

+

TABLE OF CONTENTS

**Chapter 1: Introduction** \- In which I tell you what this project is all about, provide brief justification for its existence, and furnish readers with a table of contents. (Hint: You're here right now!) 

**[Chapter 2: Methodology for collecting, cleaning, and formatting the data](http://archiveofourown.org/works/1546856/chapters/3282578)** \- In which I explain more thoroughly exactly how I obtained this data and how I finagled it into a useable state. This chapter exists for transparency's sake and is not super interesting in and of itself. Skip it unless you find methodology super interesting, you want to copy my process to round up some of your own data, or you are skeptical about the validity of my data/conclusions and want to take a look at exactly how the sausage got made. 

**[Chapter 3: What does DA fandom like to write?](http://archiveofourown.org/works/1546856/chapters/3292136)** \- In which I investigate which fan works are most popular in terms of number of _works_ written and number of _words_ written. Basically, I seek to answer the question of what kinds of works are most commonly uploaded. The analysis in this chapter is conducted mostly through the lens of gendered relationship categories (F/F, F/M, M/M, Gen, Multi, Other, No Category).

+


	2. Methodology for collecting, cleaning & formatting the data

**Summary for the Chapter:**

> In which I explain more thoroughly exactly how I obtained this data and how I finagled it into a useable state. This chapter exists for transparency's sake and is not super interesting in and of itself. Skip it unless you find methodology super interesting, you want to copy my process to round up some of your own data, or you are skeptical about the validity of my data/conclusions and want to take a look at exactly how the sausage got made.



+

**INTRO**

Collecting, cleaning, and formatting the data has been by far the lengthiest part of this whole process. My own inexpertise doubtless lengthened it, but I think regardless of familiarity it would probably be a headache for anyone. There was plenty of room in this process for both machine and human error. I have tried to minimize both, but in the interests of transparency (and in the interests of anyone else who wants to try) I am outlining my procedure here. 

**STEP 1: CHOOSING A SOURCE/THE NATURE OF THE DATA**

I chose Archive of Our Own as my source for two reasons. First, it seems to be a popular choice among tumblr users, who I see as my primary audience. Second, Ao3 has the most thorough crop of data available, much more so than other fanfiction archives that I know of. With a story of my own as an example, here is the information I was able to get about each story in the "Dragon Age - All Media Types" tag:

Or, to spell it out, for each story in the archive I have:

  * Title
  * Author
  * Date Updated
  * Wordcount
  * Chapters
  * Number of Kudos
  * Number of Comments
  * Number of Bookmarks
  * Hitcount
  * Rating (General Audiences, Teen, Mature, Explicit, Not Rated)
  * Relationship Category (F/F, F/M, M/M, Gen, Multi, Other, No Category)
  * Archive Warnings (Graphic Depictions of Violence, Major Character Death, Rape/Non-con, Underage Sex)
  * Complete/Incomplete Status
  * Fandom ("Dragon Age" "Dragon Age - All Media Types" "Dragon Age: Origins" "Dragon Age II" etc.)
  * Tags (including pairing tags, character tags, and miscellaneous tags)
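For anyone who would rather hold this in code than in a spreadsheet, the field list above could be modelled roughly as follows. This is only a sketch: the class name `Work`, the attribute names, and the choice of types are assumptions made for illustration, not the layout of the actual dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Work:
    """One row of the dataset: a single story in the tag (hypothetical layout)."""
    title: str
    author: str
    date_updated: str      # kept as text; no particular date format is assumed
    wordcount: int
    chapters: str          # kept as text, since it may not be a plain number
    kudos: int = 0
    comments: int = 0
    bookmarks: int = 0
    hits: int = 0
    rating: str = "Not Rated"
    # A work can carry several categories, warnings, fandoms and tags,
    # so these are lists rather than single values.
    categories: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)
    complete: bool = False
    fandoms: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)


# A made-up example record, not a real work from the archive.
example = Work(
    title="An Example Story",
    author="someone",
    date_updated="2014-04-15",
    wordcount=1200,
    chapters="1/1",
    categories=["F/F", "Gen"],
    complete=True,
)
print(example.categories, example.kudos)
```

Keeping the multi-valued fields as lists makes the "a work can be tagged with more than one value" problem explicit from the start.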



This is, as far as I know, the most extensive dataset existing about fan works. And I do mean existing. As in, everywhere, not just for the Dragon Age fandom, and not just for Ao3. I have looked for similar datasets and haven’t found any, not among fandom folk on the internet and not in the academic world. My search skills may be lacking, so if you know of any such dataset, please do point me to it. I have searched and searched and come up bereft.

My guess as to why is that getting data like this is _really wildly obnoxiously hard._ None of the fanfiction hosting sites make their data available to the public: Ao3 explained to me, when I requested it, that doing so would be more of a technical, time-consuming hassle than their volunteers are able to accommodate. Fanfiction.net never got back to me at all.

Because of this, the process of collecting, cleaning, and formatting the data has been a trial and a half, and I only managed to figure it out with a lot of kind help. Hopefully one day Ao3 will release their data to the public. Until then, this is what we’re stuck with. 

**STEP 2: COLLECTING THE DATA WITH KIMONO**

[kimono](http://www.kimonolabs.com/) is a free (up to a certain level) service that allows the user to collect data from a website. It definitely has some limitations, which I will go through, but it is also AMAZING in that it lets someone who doesn’t know how to code (read: me) gather data that would otherwise be completely inaccessible. The third reason I chose Ao3 as a source actually has to do with it being the easiest of the fanfiction archives for kimono to collect data from. Since I want to be transparent, and since I want others to be able to do this, I’m going to go really step-by-step through how to use kimono. Skip to the next section if you have no interest. 

So first, I sign up for kimono and add their little bookmarklet to my bookmarks bar, as one is instructed to do when one signs up for kimono. Then I get started. I go to the ["Dragon Age - All Media Types"](http://archiveofourown.org/works/search?utf8=%E2%9C%93&work_search%5Bquery%5D=&work_search%5Btitle%5D=&work_search%5Bcreator%5D=&work_search%5Brevised_at%5D=&work_search%5Bcomplete%5D=0&work_search%5Bsingle_chapter%5D=0&work_search%5Bword_count%5D=&work_search%5Blanguage_id%5D=&work_search%5Bfandom_names%5D=Dragon+Age+-+All+Media+Types&work_search%5Brating_ids%5D=&work_search%5Bcharacter_names%5D=&work_search%5Brelationship_names%5D=&work_search%5Bfreeform_names%5D=&work_search%5Bhits%5D=&work_search%5Bkudos_count%5D=&work_search%5Bcomments_count%5D=&work_search%5Bbookmarks_count%5D=&work_search%5Bsort_column%5D=&work_search%5Bsort_direction%5D=&commit=Search) tag and click my little bookmarklet, and I get something that looks like this:

(You’ll notice that this is not actually the "Dragon Age - All Media Types" tag but rather a search done with that tag and also with my pseud entered in Author/Artist field. This is because I didn’t want to throw anyone else’s works up here without their input. Obviously, if you were doing this for real, you’d leave the Author/Artist field blank). 

Okay so what you see up there that is relevant is the little yellow bubble, the plus sign next to the little yellow bubble, the field that says "property1" that is waiting for a real name, and the little bubble with an open book inside it.

What kimono is waiting for you to do is teach it what kind of data you want it to grab. So let’s say the first thing I wanna get is the title. What I’m gonna do is go down and click on the title of one of the stories listed.

As you can see, it’s highlighted the title that I clicked on in yellow, just like the color of the bubble. There is also a "1" in the bubble now -- this indicates that kimono has picked up one piece of confirmed data. 

Notice also other elements on the page highlighted in less striking shades of yellow with an "x" and a "✓". This is kimono asking me if these elements are the same sort of thing as the title I clicked on, and if it should also pick those up.

For the title of the second work, I’m going to hit the "✓" to indicate that yes, this is also a title, and yes, I would also like you to grab this. For the authors’ names that kimono has highlighted, I’m going to hit "x" to tell it that no, these are not titles and I do not want you to grab them. Finally, I am going to rename my little yellow bubble "Title" instead of "property1" so that I can remember easily what it is.

Now all the titles are highlighted. You’ll notice my little yellow bubble now says "10" -- this is how I know it’s gotten everything I want, because there are 10 works on this page in total, and the number in the bubble indicates that kimono has picked up 10 pieces of data. kimono has successfully learned what a title looks like. 

To get more data (Author, Rating, Relationship Category, etc.), I’m going to click the "+" bubble next to my first little yellow bubble. It will give me new bubbles with new colors, and I can teach kimono how to recognize whatever new categories I please.

Problems do, of course, arise with kimono. It can’t always learn exactly what you want to teach it. For example, kimono doesn’t understand the difference between the values on the bottom. If I want to grab them (which I did), I have to tell it to get everything and then sort it out later. 

When I have everything that I think I want to get, the last step is to teach kimono how to deal with multiple pages. Since I’m dealing with an archive that has multiple pages, I need to teach kimono how to hit the "Next" button so it can collect data from all of those pages. I click the little bubble to the right with the open book inside of it. This is the pagination button. After I click the pagination button, I click on the "Next" button. kimono now knows how to flip through pages!

Next, I hit the "Done" button. I create a name for my API (that’s what the thing is that I just made with kimono), set the crawl time to "On demand" and set the maximum number of pages that I want it to click through to get me data.

Then I hit "Create API". 

The last thing I need to do to get my dataset is to get kimono to execute a crawl for me. On my API Details page on kimono, I need to navigate to the "Pagination/Crawling" tab and hit "Start crawl now".

For lots of pages, the crawl will take a while, up to a few hours (if it’s working properly, anyway). Once it finishes up, I can click on "csv" and download the file. I should be able to open it up in Excel, google spreadsheets, or another spreadsheet program. These files are going to be very large (thousands of rows deep, hundreds of columns wide), so some programs might have trouble handling them. I have had best luck with Excel, myself.

And so... that’s it. That’s how I got my data. A few additional notes: 

First of all, kimono limits you in the amount of data you can get. This is only going to work for fandoms where the number of works is relatively small, because kimono will only grab 1000 pages maximum. Also, though they don’t say this, kimono seems to only be willing to give you 2500 rows of data maximum per crawl. This means that if you are trying to get data for a tag with more than 2500 works in it you have to execute multiple crawls. I did this for the "Dragon Age - All Media Types" tag by separating the tag into three categories based on wordcount ([less than 1139 words](http://archiveofourown.org/works/search?utf8=%E2%9C%93&work_search%5Bquery%5D=&work_search%5Btitle%5D=&work_search%5Bcreator%5D=&work_search%5Brevised_at%5D=&work_search%5Bcomplete%5D=0&work_search%5Bsingle_chapter%5D=0&work_search%5Bword_count%5D=%3C1139&work_search%5Blanguage_id%5D=&work_search%5Bfandom_names%5D=Dragon+Age+-+All+Media+Types&work_search%5Brating_ids%5D=&work_search%5Bcharacter_names%5D=&work_search%5Brelationship_names%5D=&work_search%5Bfreeform_names%5D=&work_search%5Bhits%5D=&work_search%5Bkudos_count%5D=&work_search%5Bcomments_count%5D=&work_search%5Bbookmarks_count%5D=&work_search%5Bsort_column%5D=&work_search%5Bsort_direction%5D=&commit=Search), [between 1140 and 3500 words](http://archiveofourown.org/works/search?utf8=%E2%9C%93&work_search%5Bquery%5D=&work_search%5Btitle%5D=&work_search%5Bcreator%5D=&work_search%5Brevised_at%5D=&work_search%5Bcomplete%5D=0&work_search%5Bsingle_chapter%5D=0&work_search%5Bword_count%5D=1140-3500&work_search%5Blanguage_id%5D=&work_search%5Bfandom_names%5D=Dragon+Age+-+All+Media+Types&work_search%5Brating_ids%5D=&work_search%5Bcharacter_names%5D=&work_search%5Brelationship_names%5D=&work_search%5Bfreeform_names%5D=&work_search%5Bhits%5D=&work_search%5Bkudos_count%5D=&work_search%5Bcomments_count%5D=&work_search%5Bbookmarks_count%5D=&work_search%5Bsort_column%5D=&work_search%5Bsort_direction%5D=&commit=Search), [greater than 3500 words](http://archiveofourown.org/works/search?utf8=%E2%9C%93&work_search%5Bquery%5D=&work_search%5Btitle%5D=&work_search%5Bcreator%5D=&work_search%5Brevised_at%5D=&work_search%5Bcomplete%5D=0&work_search%5Bsingle_chapter%5D=0&work_search%5Bword_count%5D=%3E3500&work_search%5Blanguage_id%5D=&work_search%5Bfandom_names%5D=Dragon+Age+-+All+Media+Types&work_search%5Brating_ids%5D=&work_search%5Bcharacter_names%5D=&work_search%5Brelationship_names%5D=&work_search%5Bfreeform_names%5D=&work_search%5Bhits%5D=&work_search%5Bkudos_count%5D=&work_search%5Bcomments_count%5D=&work_search%5Bbookmarks_count%5D=&work_search%5Bsort_column%5D=&work_search%5Bsort_direction%5D=&commit=Search)), each of which had fewer than 2500 works at the time when I executed the crawls.
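When a tag is split into several crawls like this, it is worth checking that the word-count filters line up with no gaps between them. The sketch below is one way to do that. The ranges are read off the three search links above, and it assumes that `<1139` means strictly less than 1139 and `>3500` means strictly greater than 3500; the names `CRAWL_RANGES` and `find_gaps` are made up for this example.

```python
# Word-count filters as they appear in the three search links.
# Each tuple is (lowest count matched, highest count matched);
# None means the range is open-ended at the top.
CRAWL_RANGES = [
    (0, 1138),      # "<1139", assuming a strict less-than
    (1140, 3500),   # "1140-3500"
    (3501, None),   # ">3500", assuming a strict greater-than
]


def find_gaps(ranges):
    """Return the word counts, as inclusive (start, end) pairs, that no range matches."""
    gaps = []
    expected_start = 0
    for low, high in sorted(ranges, key=lambda r: r[0]):
        if low > expected_start:
            gaps.append((expected_start, low - 1))
        if high is None:
            return gaps  # an open-ended range covers everything above it
        expected_start = max(expected_start, high + 1)
    gaps.append((expected_start, None))  # nothing covers the top end
    return gaps


print(find_gaps(CRAWL_RANGES))
```

Under those assumptions the check reports `[(1139, 1139)]`, which suggests that works of exactly 1139 words would fall between the first two crawls. How many works that affects, if any, cannot be told from the links alone.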

Second, with the way that the formatting of the csv files that kimono outputs works out, you really, really don’t want to get data for the fandoms and tags in the same file as data for everything else. It makes the formatting practically impossible. So I had to do separate, additional crawls to get data on the fandoms and on the tags. 

To get all the data I wanted, I executed nine crawls in total. Which leads me to the problem of...

**STEP 3: CLEANING AND FORMATTING THE DATA**

This is the longest and most obnoxious step, because it feels like it is never done. There is always a new way that you could fiddle with your dataset to make it possible for you to turn it into something interesting. Because of the ongoing nature of this process, I am not going to be as thorough here as I was in explaining the use of kimono. I am also only going to address what I did to work with the files containing data for everything except the fandoms and the tags. This is because the fandoms and the tags files have been a lot more complicated to format, and because I haven’t done any substantial work with them yet. When I do, I will update this chapter. 

Until then, here is a basic outline of what I did with the raw data from kimono. 

So, upon downloading a csv file from kimono, you get something that looks like this:

Basically, a total mess. 

The first thing to do is merge the files. For the files not containing data for the fandoms or for the tags, this was pretty simple. I simply cut and pasted the data from one file into another until it was all consolidated. The total number of works I had data for at this point was **7,439,** which was very close to the total number of works Ao3 indicated were contained in the "Dragon Age - All Media Types" tag as of the time I downloaded the dataset.
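The merge can also be done in code rather than by cutting and pasting. Here is a minimal sketch using Python's standard `csv` module, not the method used for this project. The column names `Title.text` and `Words` are placeholders invented for the example; only `Title.href` is a column name mentioned in this chapter.

```python
import csv
import io


def merge_csv_files(file_objects):
    """Concatenate rows from several CSV files, keeping every column seen in any file."""
    fieldnames = []
    rows = []
    for handle in file_objects:
        reader = csv.DictReader(handle)
        # Collect column names in first-seen order, without repeats.
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        rows.extend(reader)
    return fieldnames, rows


# Tiny stand-ins for two crawl exports. Real files would be opened with
# open(path, newline="") instead of io.StringIO.
crawl_a = io.StringIO("Title.href,Title.text,Words\n/works/1,First Story,900\n")
crawl_b = io.StringIO("Title.href,Title.text,Words\n/works/2,Second Story,4200\n")

columns, merged = merge_csv_files([crawl_a, crawl_b])
print(len(merged), columns)
```

Reading by header name rather than by column position means the files do not need to have their columns in the same order.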

This changed after I took the step of [deleting any duplicate works](http://office.microsoft.com/en-us/excel-help/delete-duplicate-rows-from-a-list-in-excel-HA001034626.aspx) within the spreadsheet (i.e., works that, for whatever reason, were showing up twice in my data). After this step, the total number of works I had was **5,847**.
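A scripted version of this step might look like the sketch below. Note that it is a slightly different technique from Excel's duplicate-row removal: it treats two rows as the same work whenever their `Title.href` link matches, even if other cells differ. The function name `drop_duplicate_works` and the sample rows are invented for illustration.

```python
def drop_duplicate_works(rows, key="Title.href"):
    """Keep the first row seen for each work link and discard later repeats."""
    seen = set()
    unique = []
    for row in rows:
        identifier = row[key]
        if identifier in seen:
            continue
        seen.add(identifier)
        unique.append(row)
    return unique


# Made-up rows: the first and third point at the same work.
rows = [
    {"Title.href": "/works/1", "Title.text": "First Story"},
    {"Title.href": "/works/2", "Title.text": "Second Story"},
    {"Title.href": "/works/1", "Title.text": "First Story"},
]
unique_rows = drop_duplicate_works(rows)
print(len(rows), "rows in,", len(unique_rows), "rows out")
```

For scale, the two totals reported here imply that 7,439 - 5,847 = 1,592 rows were removed, a little over a fifth of the merged data.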

I don’t know why there were so many duplicates in the dataset. I don’t know whether these duplicates already existed in Ao3’s archive or whether they were somehow added by kimono, even as kimono failed to capture data for some unknown number of works. Anything is possible. Since I have no way of knowing, I can only hope that whatever works were lost were fairly representative and do not do too much to skew the data. In the future I may do more to try and fix this problem, but I have already spent a fair amount of time trying, and have come up with no solutions. Since this data is the best it’s possible to get, I decided to use it and base my conclusions off of it. I do feel confident that what I have still paints an accurate picture of the "Dragon Age - All Media Types" tag, but as I said -- anything is possible. 

After merging the data, I cleaned it up so as to make it easier to work with. I deleted all of the .href columns (any data element I was collecting that doubled as a link generated two columns: one with the text or value and one with the URL that the link led to) except for Title.href, which was useful as a unique identifier for every work. I also created new columns for Bookmarks, Comments, Kudos, and Hits, so that these values were no longer scattered in different columns. This was accomplished over about twelve hours using the copy-and-paste method alongside the alphabetical sort feature in Excel. I spot-checked my work and I believe that all the values I copied are accurate, but it is possible that I made a mistake somewhere. (There is without doubt a better way to accomplish this step with coding. Unfortunately, I know it not.)
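One possible coded version of the "gather the scattered values" step is sketched below. It rests on a guess about the raw layout that this chapter does not confirm: that each statistic arrives as a label cell such as `Kudos:` followed immediately by a cell holding its number, in whatever column they happened to land. If the real export looks different, the scanning rule would need to change. The names `STAT_LABELS` and `collect_stats` are hypothetical.

```python
STAT_LABELS = ("Bookmarks", "Comments", "Kudos", "Hits")


def collect_stats(cells):
    """Scan a row's cells for 'Label:' markers and take the following cell as the value.

    A label that never appears is recorded as 0, which is an assumption:
    it treats a missing statistic as "none yet".
    """
    stats = {label: 0 for label in STAT_LABELS}
    for position, cell in enumerate(cells[:-1]):
        label = cell.strip().rstrip(":")
        if label in stats:
            value = cells[position + 1].strip().replace(",", "")
            if value.isdigit():
                stats[label] = int(value)
    return stats


# A made-up row in the assumed layout.
raw_row = ["Words:", "1,200", "Kudos:", "15", "Hits:", "302"]
print(collect_stats(raw_row))
```

Because the function looks for labels rather than positions, it does not matter which column a given value ended up in.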

Finally, I had to deal with the problem of it being possible to tag a work with more than one value (a work can contain both F/F and Gen relationship categories, for example, or a work might warn for both Major Character Death and Graphic Depictions of Violence). In practice, this makes a single column for showing relationship categories or archive warnings difficult to work with. Instead, I made new columns for each category or warning I was interested in (one column for F/F, one column for Gen, etc.) and populated it with a "1" if the work contained that value at all, even in combination with other values, and a "0" if the work did not contain that value. In this way, I was able to get data both for all works tagged with a certain value and for works tagged only with that value.
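The 1/0 columns described above can be sketched in a few lines. This example assumes the raw data holds a work's relationship categories as one comma-separated string in a column called `Category`; that column name and the helper names are placeholders, not the actual spreadsheet headers.

```python
CATEGORIES = ["F/F", "F/M", "M/M", "Gen", "Multi", "Other", "No Category"]


def add_indicator_columns(row, source_column="Category"):
    """Add one 0/1 column per relationship category to a row (a dict)."""
    labels = {part.strip() for part in row[source_column].split(",")}
    for category in CATEGORIES:
        row[category] = 1 if category in labels else 0
    return row


def is_only(row, category):
    """True when the work carries this category and no other."""
    return row[category] == 1 and sum(row[c] for c in CATEGORIES) == 1


# Made-up rows: one tagged with two categories, one tagged with a single category.
mixed = add_indicator_columns({"Category": "F/F, Gen"})
single = add_indicator_columns({"Category": "M/M"})
print(mixed["F/F"], mixed["Gen"], is_only(mixed, "F/F"), is_only(single, "M/M"))
```

Summing the indicator columns for a row gives the number of categories it carries, which is what makes the "all tagged" versus "only tagged" split possible.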

My final(ish) dataset looks more like this:

I am happy to share this data in both raw and final forms, so please get in touch with me if you’d like it. 

**STEP 4: ??????**

Reading through all of that might have been perplexing -- I myself remain perplexed. I hope that it at least gave a bit of insight on how I obtained this data and how you might go about obtaining your own. As I post new chapters containing data wrangled in new ways, I will continue to update this chapter to reflect new methodologies. I do hope one day to turn it into a full-blown tutorial. Please do get in touch with me if you have questions about anything, and thanks for reading through.

**SUMMARY OF FINDINGS**

  * Working with data is difficult.
  * Good Lord, is it difficult.
  * I really need to learn how to code.



+


	3. What does DA fandom like to write?

**Summary for the Chapter:**

> In which I investigate which fan works are most popular in terms of number of _works_ written and number of _words_ written. Basically, I seek to answer the question of what kinds of works are most commonly uploaded. The analysis in this chapter is conducted mostly through the lens of gendered category pairings (F/F, F/M, M/M, Gen, Multi, Other, No Category).



+

**WHAT DOES DA FANDOM LIKE TO WRITE?**

When we’re talking about popularity on Ao3, we’re actually talking about two things:

  1. What’s popular to write (as measured by the number of works and the wordcount), and
  2. What’s popular to read (as measured by the number of comments, kudos, bookmarks, and hits).



In this chapter, I’m going to be talking about what’s popular to write. In Chapter 4, I will be talking about what’s popular to read. I am basing my analyses for both chapters largely on Ao3’s relationship categories, which are:

  * F/F ("femslash")
  * F/M ("het")
  * M/M ("slash")
  * Gen
  * Multi
  * Other
  * No Category



The reason for this is, first of all, that how these categories compare is of a lot of interest to me personally. It also seems that an analysis focusing on these categories could shed a lot of light on questions about why the popularities of femslash, het, and slash break down the way they do and whether or not that breakdown is problematic.

**IN GENERAL**

So the first thing I can tell you is that, as far as writing goes within the DA fandom, **het** is the most popular category.

Live version of this graph is [here](https://plot.ly/~Scuffin_MacGuffin/3).

In the graph above you can see the number of works for each relationship category in the "Dragon Age - All Media Types" tag on Ao3. You’ll notice that there are two bars for each relationship category: the dark blue bar shows every work tagged with that relationship (a work can be tagged with more than one) and the light blue bar shows works tagged only with that relationship.

It is interesting to find that het is the most popular category of written works in my dataset; this runs counter to the prevailing trend on Ao3, in which slash dominates. In fact, the number of M/M fics on the archive right now, according to [Ao3’s search feature](http://archiveofourown.org/works/search), is about double the number of F/M fics. Also interesting: while works tagged F/F make up about **5%** of the archive’s works (again, as deduced using [Ao3’s search feature](http://archiveofourown.org/works/search)), works tagged F/F make up about **7%** of my dataset. A small increase, but an increase nonetheless.

I speculate that the percentage of _Dragon Age_ works featuring women is greater than in general for Ao3 because of the much greater number of women in the Dragon Age games. Ao3’s [most popular fandoms](https://docs.google.com/spreadsheet/ccc?key=0Ar9EDxcJk-BRdExRNVpiSHE5SHhsYklENnA0WVcxU2c#gid=4) are largely for shows such as Supernatural or Sherlock which do not feature many female characters. However, in spite of the more equal gender distribution of the _Dragon Age_ franchise, the number of F/F works is still quite small. If female characters were getting an amount of attention proportionate to the number of female characters that exist in the _Dragon Age_ games, then we could expect the number of works featuring femslash to be only slightly less than the number of works featuring slash. Since this is not the case, there must be some other factors at play.

Live version of this graph is [here](https://plot.ly/~Scuffin_MacGuffin/5).

The above graph is much the same as the one before it, but it uses total wordcount as a measure of popularity instead of total number of works (i.e., I took the sum of the wordcount of all works for every relationship category). You see the same pattern as you do in the graph showing number of works, though with certain differences made more dramatic. The Gen category has shrunk, because though there are many Gen works, these works tend to be short -- especially those only tagged with Gen. The Multi category has grown a little because though there are relatively few Multi works, they tend to be longer. The F/M category has grown relative to the M/M and F/F categories, indicating that either F/M works tend to be longer on the whole, or that there are more really long outliers in the F/M category (i.e., works greater than 100k words long). 
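The wordcount totals behind a chart like this can be computed from the 1/0 category columns. The sketch below shows one way, assuming each row is a dict with an integer `Words` value and one indicator per category. The column name `Words`, the function name, and the sample numbers are all made up for illustration and are not figures from the dataset.

```python
def totals_by_category(rows, categories, value_column="Words"):
    """Sum a numeric column per category, for all tagged works and for only-tagged works."""
    all_tagged = {c: 0 for c in categories}
    only_tagged = {c: 0 for c in categories}
    for row in rows:
        present = [c for c in categories if row[c] == 1]
        for category in present:
            all_tagged[category] += row[value_column]
        # A work counts as "only tagged" when exactly one category is present.
        if len(present) == 1:
            only_tagged[present[0]] += row[value_column]
    return all_tagged, only_tagged


# Three invented works: one F/F only, one F/F plus F/M, one F/M only.
sample = [
    {"F/F": 1, "F/M": 0, "Words": 1000},
    {"F/F": 1, "F/M": 1, "Words": 5000},
    {"F/F": 0, "F/M": 1, "Words": 3000},
]
all_words, only_words = totals_by_category(sample, ["F/F", "F/M"])
print(all_words, only_words)
```

Note that a work tagged with two categories contributes its full wordcount to both "all tagged" totals, so those totals add up to more than the dataset's overall wordcount.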

I do want to briefly point to the difference between the All Tagged and Only Tagged categories. As you probably know, if you read fanfiction, stories often have a primary pairing, which gets most of the author’s attention, and one or more background pairings. Say a story is tagged both F/F and F/M -- I have no way to tell, from my dataset, which of those is a primary pairing and which of those is a background pairing. But comparing the total number of every story tagged with a relationship category vs. the number of stories tagged only with that relationship category can give me a good proxy. 

What you find, looking at these graphs, is that a higher percentage of works tagged with slash and het are tagged with only slash and het than of femslash works tagged only with femslash. In terms of number of works, **75%** of F/M stories are only tagged with F/M, and **76%** of M/M stories are tagged only with M/M. The primary pairing in these works is obviously F/M and obviously M/M, as it could be nothing else. However, only **57%** of F/F stories are tagged only with F/F. The same trend occurs in the realm of wordcount, only with more exaggeration -- **61%** of the words written in the F/M category and **59%** of the words written in the M/M category are tagged only with F/M or with M/M. However, only **32%** of the words written in the F/F category are tagged only with F/F. 
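Each percentage in the paragraph above is an "only tagged" count divided by the matching "all tagged" count. As a minimal sketch of that calculation, again assuming rows with 1/0 category indicators (the function name and sample rows are invented, and the output is not a figure from the dataset):

```python
def only_tagged_share(rows, category, categories):
    """Percentage of works tagged with `category` that carry no other category."""
    all_tagged = [row for row in rows if row[category] == 1]
    if not all_tagged:
        return None  # no works in this category, so no share to report
    only_tagged = [
        row for row in all_tagged
        if sum(row[c] for c in categories) == 1
    ]
    return 100 * len(only_tagged) / len(all_tagged)


# Five invented works: four carry F/F, and two of those carry nothing else.
cats = ["F/F", "F/M"]
sample_rows = [
    {"F/F": 1, "F/M": 0},
    {"F/F": 1, "F/M": 1},
    {"F/F": 1, "F/M": 1},
    {"F/F": 1, "F/M": 0},
    {"F/F": 0, "F/M": 1},
]
print(only_tagged_share(sample_rows, "F/F", cats))
```

The wordcount version of the same share would divide summed wordcounts instead of row counts.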

This indicates to me that femslash is used more often as a background pairing than are het or slash. It could also indicate that femslash stories are more likely to have background pairings than other kinds of stories, but that seems to me to be a less likely explanation. I can’t think of any reason why the second explanation might make sense, whereas I have already established that F/F stories are less popular. It seems reasonable that their lower popularity might account for more works in the femslash category actually primarily featuring a different category. If this is true, it would mean that femslash fans have even less to read compared to other fans than it first appears. 

**BY RATING**

I was curious as to whether the trends I explored above could be broken down further. So I decided to separate each relationship category into the five possible ratings: **General Audiences**, **Teen And Up Audiences**, **Mature**, **Explicit**, and **Not Rated**.

Here are the results for all works tagged with each category:

Live version of this graph is [here](https://plot.ly/~Scuffin_MacGuffin/8), though it lacks the scaling of the columns on the version above because I did the scaling in a separate program.

And here are the results for works tagged only with each category:

Live version of this graph is [here](https://plot.ly/~Scuffin_MacGuffin/7), though it lacks the scaling of the columns on the version above because I did the scaling in a separate program.

The columns on these two graphs are scaled in proportion to the relative size of each category as determined by number of works (i.e., the width of the columns in these graphs corresponds to the height of the bars on the "Number of Works per Relationship Category" chart further up the page). Within each column, the sizes of the colored portions show what percentage of works in each category was rated Explicit, Mature, Teen, etc. 
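For readers curious how such a chart is laid out, here is a minimal sketch of the two quantities involved: a column width proportional to the number of works in a category, and the share of each rating within that column. The sample data and the name `column_layout` are hypothetical, and scaling widths relative to the largest category is an assumption, not necessarily how the graphs above were scaled.

```python
from collections import Counter, defaultdict

# Hypothetical sample records: (relationship category, rating).
works = [
    ("F/M", "Teen And Up Audiences"),
    ("F/M", "Teen And Up Audiences"),
    ("F/M", "Explicit"),
    ("F/M", "Mature"),
    ("M/M", "Explicit"),
    ("M/M", "Teen And Up Audiences"),
]

def column_layout(works):
    """Return {category: {"width": w, "shares": {rating: fraction}}}."""
    per_category = defaultdict(Counter)
    for category, rating in works:
        per_category[category][rating] += 1
    # Assumption: widths are expressed relative to the largest category.
    largest = max(sum(r.values()) for r in per_category.values())
    layout = {}
    for category, ratings in per_category.items():
        n = sum(ratings.values())
        layout[category] = {
            "width": n / largest,
            "shares": {rating: k / n for rating, k in ratings.items()},
        }
    return layout

layout = column_layout(works)
```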

For both the het and femslash categories, the greatest number of works are rated Teen And Up Audiences. This is true for both all works categorized as F/F or F/M and works only categorized as F/F or F/M. Only **20%** of all works categorized as F/F are rated Explicit, and only **17%** of works categorized as only F/F are rated Explicit. **22%** of all works categorized as F/M are rated Explicit, and this percentage is the same for works categorized as only F/M. 

For slash works, however, the greatest number of works are not rated Teen And Up Audiences, but are rather rated Explicit. **31%** of all works categorized as M/M are rated Explicit, and **32%** of works categorized only as M/M are rated Explicit. 

It seems that when writing fanfiction, authors in the Dragon Age fandom are more likely to make their slash fic explicit than their het fic or their femslash fic. Does this indicate that slash is indeed fetishized, at least to some extent? Does this mean that authors are less comfortable exploring women’s sexuality? The data can’t actually answer these questions; these are merely thoughts that occurred to me when looking at the results. It is true that the "Dragon Age - All Media Types" tag is mostly a mirror of general Ao3 trends: overall, **18%** of F/F works, **16%** of F/M works, and **26%** of M/M works are rated Explicit (as determined using [Ao3’s search feature](http://archiveofourown.org/works/search)). 

**BY WORDCOUNT**

The majority of works in the "Dragon Age - All Media Types" tag are under 2,000 words long. On the chart below, you can see how many short works there are compared to how many long works -- and keep in mind that the x-axis only goes up to 10,000 words! If I were to show you the full distribution of wordcounts, most works would be represented as a skinny, skinny little line at the very left of the chart (in fact, let me [show you that](https://plot.ly/~Scuffin_MacGuffin/11), just for kicks). I included only complete works in order to prevent works that are going to become longer from skewing the data.

Live version of this graph is [here](https://plot.ly/~Scuffin_MacGuffin/10), lacking color, as I added color in a different program.

Because the wordcounts were too varied to really be useful in making interesting graphs, I wanted to split the wordcounts up into categories. I chose these categories partly based on my trying to get a roughly equal number of works in each, and partly based on my sensibilities as a writer (i.e., how it makes sense to me to sort my own work on the basis of length). You can see the categories I chose in the above graph, represented by color -- though note that the purple color is really 5,001 to 20,000 words, not 5,001 to 10,000 words, and that there is an additional category for works over 20,000 words long. 
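Sorting works into wordcount categories like these can be sketched as below. Only the "5,001 to 20,000" and "over 20,000" categories are stated explicitly in the text; the other boundaries (500, 2,000, and 5,000) and all names in the code are assumptions made for illustration.

```python
from bisect import bisect_left

# Inclusive upper bounds of each category. The 500, 2,000, and 5,000
# boundaries are assumed; only 20,000 is stated outright in the text.
UPPER_BOUNDS = [500, 2_000, 5_000, 20_000]
LABELS = [
    "up to 500",
    "501 to 2,000",
    "2,001 to 5,000",
    "5,001 to 20,000",
    "over 20,000",
]

def wordcount_category(wordcount):
    """Return the label of the wordcount category a work falls into."""
    # bisect_left finds the first upper bound that is >= wordcount,
    # so a work sitting exactly on a boundary lands in the lower category.
    return LABELS[bisect_left(UPPER_BOUNDS, wordcount)]

for wc in (120, 2_000, 5_001, 20_000, 150_000):
    print(wc, "->", wordcount_category(wc))
```

Keeping the boundaries in one list makes it easy to try different cut points when aiming for roughly equal numbers of works per category.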

Using these categories, we can look at how the relationship categories fall along wordcount. Here they are for _all_ works tagged with a particular relationship category:

And here they are for works tagged _only_ with a particular relationship category:

Live version of this graph is [here](https://plot.ly/~Scuffin_MacGuffin/12).

Live version of this graph is [here](https://plot.ly/~Scuffin_MacGuffin/13).

The lines on these graphs represent how much of each wordcount category is made up of the relationship category that that line represents. Following the purple F/M line on the first of the two graphs, for example, you can see that a little over 30% of works that are under 500 words long are categorized as F/M. Following the line to the end of the chart, you can see that around 60% of works greater than 20,000 words long are categorized as F/M. 
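The quantity plotted by each line, i.e. the fraction of works in a wordcount category that carry a given relationship category, can be sketched like this. The sample records and the function name `category_share_by_bin` are hypothetical, invented for illustration.

```python
from collections import Counter

# Hypothetical sample records: (wordcount category, set of relationship categories).
works = [
    ("under 500", {"Gen"}),
    ("under 500", {"F/M"}),
    ("under 500", {"Gen"}),
    ("over 20,000", {"F/M", "M/M"}),
    ("over 20,000", {"F/M"}),
]

def category_share_by_bin(works, category):
    """Return {wordcount category: fraction of its works tagged with `category`}."""
    totals = Counter()
    tagged = Counter()
    for bin_label, categories in works:
        totals[bin_label] += 1
        if category in categories:
            tagged[bin_label] += 1
    return {label: tagged[label] / totals[label] for label in totals}

fm_shares = category_share_by_bin(works, "F/M")
gen_shares = category_share_by_bin(works, "Gen")
```

Because a single work can carry several relationship categories, the shares for different categories within one wordcount category can add up to more than 100% when all tagged works are counted.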

The most immediately obvious conclusion we can draw from looking at these two graphs is that the longest works tend to favor romance. The longer a work gets, the less likely it is to be Gen. Conversely, the longer a work is, the more likely it is to be femslash, het, or slash -- look at how the shapes of the F/F, F/M, and M/M lines mirror each other, in spite of their differing locations. The upward trend for works containing these three relationships is the same. 

Comparing the graph for all works tagged with a particular relationship category against the graph for works tagged only with that category, one difference stands out: the lines for all works tagged with F/F, F/M, and M/M keep rising towards the right (first graph), while the lines for works tagged only with F/F, F/M, or M/M plateau towards the right (second graph). This indicates that the longer a work gets, the more likely it is to be tagged with more than one relationship category. The longer the work, the more kinds of relationships it contains. 

**SUMMARY OF FINDINGS**

In looking at what kinds of works are popular to write in the "Dragon Age - All Media Types" tag, my findings indicate that:

  * Of the gendered relationship categories, het is most commonly written. Slash is the next most common. Femslash is quite small in comparison to het and slash.
  * The number of works in the "Dragon Age - All Media Types" tag which feature a relationship category containing women is much greater than is the norm for Ao3.
  * It is likely that femslash relationships function as background pairings more often than het or slash relationships.
  * The most common rating for works categorized as F/F or F/M is Teen And Up Audiences. The most common rating for works categorized as M/M is Explicit. This seems to be typical for works archived on Ao3.
  * The majority of works in the "Dragon Age - All Media Types" tag are under 2,000 words long.
  * The longer a work is, the more likely it is to contain a romantic relationship (F/F, F/M, or M/M) and the less likely it is to be Gen.




